Maximum and top-k diversified biclique search at scale
Abstract
Maximum biclique search, which finds the biclique with the maximum number of edges in a bipartite graph, is a fundamental problem with a wide spectrum of applications in different domains, such as e-commerce, social analysis, web services, and bioinformatics. Unfortunately, due to the difficulty of the problem in graph theory, no practical solution has been proposed to solve the issue on large-scale real-world datasets. Existing techniques for maximum clique search on general graphs cannot be applied because the search objective is two-dimensional, i.e., we have to consider the size of both parts of the biclique simultaneously. In this paper, we divide the problem into several subproblems, each of which is specified using two parameters. These subproblems are derived in a progressive manner, and in each subproblem, we can restrict the search to a very small part of the original bipartite graph. We prove that a logarithmic number of subproblems is enough to guarantee the correctness of the algorithm. To minimize the computational cost, we show how to significantly reduce the bipartite graph size for each subproblem, while preserving the maximum biclique satisfying certain constraints, by exploring the properties of the one-hop and two-hop neighbors of each vertex. Furthermore, we study the diversified top-k biclique search problem, which aims to find k maximal bicliques that cover the most edges in total. The basic idea is to repeatedly find the maximum biclique in the bipartite graph and remove it from the graph, k times. We design an efficient algorithm that shares the computation cost among the k results, based on the idea of deriving the same subproblems for different results. We further propose optimizations to accelerate the computation by pruning the search space with a size constraint and refining the candidates in a lazy manner. We use real datasets from various application domains, one of which contains over 300 million vertices and 1.3 billion edges, to demonstrate the high efficiency and scalability of our solution. It is reported that a 50% improvement in recall can be achieved after applying our method at Alibaba Group to identify fraudulent transactions in their e-commerce networks. This further demonstrates the usefulness of our techniques in practice.
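The "find the maximum biclique, remove it, repeat k times" idea from the abstract can be illustrated on a toy graph. The sketch below is not the paper's algorithm: it uses a brute-force search that is exponential in the number of left vertices, and it assumes that "removing" a biclique means deleting the edges it covers. The function names `maximum_biclique` and `top_k_diversified` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def maximum_biclique(edges):
    """Brute-force maximum edge biclique (toy graphs only, exponential time).

    edges: iterable of (left_vertex, right_vertex) pairs.
    Returns (edge_count, left_set, right_set).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    best = (0, frozenset(), frozenset())
    left = sorted(adj)
    for r in range(1, len(left) + 1):
        for subset in combinations(left, r):
            # For a fixed left side, the largest valid right side is the
            # set of right vertices adjacent to every chosen left vertex.
            common = set.intersection(*(adj[u] for u in subset))
            size = len(subset) * len(common)
            if size > best[0]:
                best = (size, frozenset(subset), frozenset(common))
    return best


def top_k_diversified(edges, k):
    """Greedy top-k: take the maximum biclique, delete its edges, repeat.

    Assumption: removal is interpreted as deleting the covered edges.
    """
    remaining = set(edges)
    results = []
    for _ in range(k):
        size, left_part, right_part = maximum_biclique(remaining)
        if size == 0:
            break  # no edges left to cover
        results.append((left_part, right_part))
        remaining -= {(u, v) for u in left_part for v in right_part}
    return results


edges = [("a", 1), ("a", 2), ("b", 1), ("b", 2), ("c", 2), ("c", 3)]
print(top_k_diversified(edges, 3))
```

On this toy input the first biclique found is {a, b} × {1, 2} with four edges, and the second is {c} × {2, 3} with two edges, after which no edges remain. The paper's contribution is precisely avoiding this kind of exhaustive search by splitting the problem into a logarithmic number of small subproblems and sharing work across the k rounds.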
Similar resources
Diversified Top-k Similarity Search in Large Attributed Networks
Given a large network and a query node, finding its top-k similar nodes is a primitive operation in many graph-based applications. Recently, enhancing search results with diversification has received much attention. In this paper, we explore a novel problem of searching for top-k diversified similar nodes in attributed networks, with the motivation that modeling diversification in an attributed...
Diversified Top-k Partial MaxSAT Solving
We introduce a diversified top-k partial MaxSAT problem, a combination of the partial MaxSAT problem and the enumeration problem. Given a partial MaxSAT formula F and a positive integer k, the diversified top-k partial MaxSAT problem is to find k maximal solutions for F such that the k maximal solutions satisfy the maximum number of soft clauses of F. This problem can be widely used in many applications inclu...
Diversified Top-k Graph Pattern Matching
Graph pattern matching has been widely used in e.g., social data analysis. A number of matching algorithms have been developed that, given a graph pattern Q and a graph G, compute the set M(Q,G) of matches of Q in G. However, these algorithms often return an excessive number of matches, and are expensive on large real-life social graphs. Moreover, in practice many social queries are to find mat...
Inapproximability of Maximum Edge Biclique, Maximum Balanced Biclique and Minimum k-Cut from the Small Set Expansion Hypothesis
The Small Set Expansion Hypothesis (SSEH) is a conjecture which roughly states that it is NP-hard to distinguish between a graph with a small set of vertices whose expansion is almost zero and one in which all small sets of vertices have expansion almost one. In this work, we prove conditional inapproximability results for the following graph problems based on this hypothesis: Maximum Edge Bicli...
Diversified Top-k Keyword Query Interpretation on Knowledge Graphs
Exploring a knowledge graph through keyword queries to discover meaningful patterns has been studied in many scenarios recently. From the perspective of query understanding, it aims to find a number of specific interpretations for ambiguous keyword queries. With the assistance of interpretation, the users can actively reduce the search space and get more relevant results. In this paper, we prop...
Journal
Journal title: The VLDB Journal
Year: 2022
ISSN: ['0949-877X', '1066-8888']
DOI: https://doi.org/10.1007/s00778-021-00681-6